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^ Abstract 

(N 

^_^ Suppose that f G M" is a vector of n error-contaminated measure- 

^ ments of n smooth values measured at distinct and strictly ascending 

^^ abscissae. The following projective technique is proposed for obtaining 

a vector of smooth approximations to these values. Find y minimizing 
O ||y — f lloo subject to the constraints that the second order consecutive 

divided differences of the components of y change sign at most q times. 
This optimization problem (which is also of general geometrical inter- 
est) does not suffer from the disadvantage of the existence of purely 
local minima and allows a solution to be constructed in 0{nq) opera- 
lO tions. A new algorithm for doing this is developed and its effectiveness 

^^ is proved. Some of the results of applying it to undulating and peaky 

^^ data are presented, showing that it is economical and can give very 

Q good results, particularly for large densely-packed data, even when 

1— I the errors are quite large. 
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^ 1. Introduction 



Given a set of observations /j, at strictly ascending abscissae Xj, for 1 < 
j < n, it may be known that the observations represent measurements of 
smooth quantities contaminated by errors. A method is then needed for 
obtaining a smooth set of points while respecting the observations as much 
as possible. One method is to make the least change to the observations, 
measured by a suitable norm, in order to achieve a prescribed definition 
of smoothness. The data smoothing method of Cullinan & Powell (1982) 
proposes defining smoothness as the consecutive divided differences of the 
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points of a prescribed order r having at most a prescribed number q of sign 
changes. There are many good reasons for using the number of sign changes 
in the divided differences of data as a criterion of smoothness. Normally 
data values of a smooth function will have very few sign changes, whereas 
if even one error is introduced, it will typically cause k sign changes in the 
A;th order divided differences of the contaminated data. Thus constructing 
a table of divided differences is a cheap and sensitive test for smoothness 
(see, for example, Hildebrand, 1956). If the observations fj, for 1 < j < n, 
are regarded as the components of a vector f G IR" and the function F : 
IR" — )■ R is defined through the chosen norm by -F(v) = ||v — f ||, then the 
data smoothing problem becomes the constrained minimization of F. This 
approach has several advantages. There is no need to choose (more or less 
arbitrarily) a set of approximating functions, indeed the data are treated 
as the set of finite points which they are rather than as coming from any 
underlying function. The method is projective or invariant in the sense that 
it leaves smoothed points unaltered. It depends on two integer parameters 
which will usually take only a small range of possible values, rather than 
requiring too arbitrary a choice of parameters. It may be possible to choose 
likely values of q and r by inspection of the data. The choice of norm can 
sometimes be suggested by the kind of errors expected, if this is known. For 
example the ii norm is a good choice if a few very large errors are expected, 
whereas the ioo norm might be expected to deal well with a large number of 
small errors. There is also the possibility that the algorithms to implement 
the method may be very fast. 

The main difficulty in implementing this method is that when q > 1, 
the possible existence of purely local minima of F makes the construction of 
an efficient algorithm very difficult. This has been done for the £2 norm for 
r < 2 — see for example Demetriou (2002). The author dealt with the case 
g = and arbitrary r for the £2 norm (Cullinan, 1990). 

It was claimed in Cullinan & Powell (1982) that when the £00 norm is 
chosen and r = 2, all the local minima of F are global and a best approxi- 
mation can be constructed in 0{nq) operations; and an algorithm for doing 
this was outlined. These claims were proved by Cullinan (1986) which also 
considered the case r = 1. It was shown that in these cases the minimum 
value of F is determined by g + r + 1 of the data, and a modified algorithm 
for the case r = 2 was developed which is believed to be better than that 
outlined in Cullinan & Powell (1982). This new algorithm will now be pre- 
sented in Section 2 and its effectiveness will be proved. Section 3 will then 
describe the results of some tests of this method which show that it is a very 
cheap way of filtering noise but can be prone to end errors. 
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2. The algorithm 

This section will construct a best i^o approximation to a vector f G IR" with 
not more than q sign changes in its second divided differences. More precisely 
let V G IR" with xi < X2 < ■ ■ ■ < Xn and 

F(v) = ||v-f|U (2.1) 

1 ( Vk- Vj Vj 



c^,M = ^^ ^^^^3^ - ^^^3^ , (2.2) 

q(v) = Q,i+i,i+2(v). (2.3) 

The feasible set of points Yq C IR" is defined as the set of all vectors 
V G R" for which the signs of the successive elements of the sequence 
l,ci(v), . . . , c„_2(v) change at most q times, and the the problem is then 
to develop an algorithm to minimize F over Yq. 

The solution depends on the fact that the value h^ of the best approx- 
imation is determined by g + 3 of the data. Since a best l^o approximation 
is not unique, there is some choice of which one to construct. The one cho- 
sen, y, has the following property: y\ = /i, |/„ = /„, and for any j with 
2 < j < n - 1, 

if ± 9_i(y) > then yj = fj ± hg 

The vector y is then determined from hq, from the set of indices i where 
Ci-i(y) 7^ 0, and from the ranges where the divided differences do not change 
sign. 

The method by which the best approximation is constructed and the 
proof of the effectiveness of the algorithm that constructs it are best un- 
derstood by first considering the cases g = 0, 1, and 2 in detail and giving 
algorithms for the construction of a best approximation in each case. Once 
this has been done it is easy to understand the general algorithm. 

When q = 0, this best approximation is formed from the ordinates of the 
points on the lower part of the boundary of the convex hull of the points 
{xj,fj), for I < j < n, (the graph of the data in the plane) by increasing 
these ordinates by an amount h. 

When q = 1, there exist integers s and t such that 1 < s < t < n, and 
yi, . . . ,ys are the ordinates on the lower part of the boundary of the convex 
hull of the data /i, . . . , /^ increased an amount h; yt, . . . , yn are ordinates 
on the upper part of the boundary of the convex hull of the data /j, . . . , /„ 
decreased an amount h; and if s < j < t, yj lies on the straight line joining 
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{xs,ys) to {xt,yt)- The best approximation therefore consists of a convex 
piece and a concave piece joined where necessary by a straight hne. 

The best approximation over Yg consists of g + 1 ahernately raised pieces 
of lower boundaries of convex hulls and lowered pieces of upper boundaries 
of concave hulls joined where necessary by straight lines. These pieces are 
built up recursively from those of the best approximation over Yq^2- 

The points on the upper or lower part of the boundary of the convex hull 
of the graph of a range of the data each lie on a convex polygon and are 
determined from its vertices. The algorithms to be described construct sets 
of the indices of these vertices and the value of a best approximation. The 
best approximation vector is then constructed by linear interpolation. 

Before considering the cases q = 0, 1, and 2, an important preliminary 
result will be established. It is a tool that helps to show that the vectors 
constructed by the algorithms are optimal. The value hg of the best approxi- 
mation over Yg will be found by the algorithm, together with a vector y EYg 
such that -F(y) = hg. To show that y is optimal, a. set K oi q + 3 indices 
will be constructed such that if v is any vector in IR" such that F(y) < hg, 
then the consecutive second divided differences of the components Vk, for 
k & K, change sign q times starting with a negative one. It will then be 
inferred that \ ^ Yg. In order to make this inference it must be shown that 
the consecutive divided differences of all the components of v have at least 
as many sign changes as those of the components with indices in K. This 
result will now be proved. 

Theorem 1 Let K (1 [1, . . . ,n} and let v G iR" be any vector such that the 
second divided differences of the Vk, for k E K , change sign q times. Then 
the divided differences of all the components of\ change sign at least q times. 

Proof Firstly, suppose that K is formed by deleting one element j from 
{l,...,n} and that 3 < j < n — 2. Let Cj_2, Cj_i, Cj be defined from v 
by ([2^ and ([23|), and let c^_2 = Cj^2,j^i,j+i{y) and c'^_^ = Cj_ij+ij+2(v) 



denote the new divided differences that result from deleting j. Manipulation 



of (2.2) yields the equations 



and 



Cj_2 = Cj-2 H Cj-i 

Xj+i — Xj^2 ^j+l '~ ^j-2 



Cj_i = Cj H '^j-l; 

^j+2 -^j—l ^j+2 -^j—l 
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so that c'j_2 lies between Cj-2 and Cj_i and c'j_i lies between Cj-i and Cj. It 
follows that the number of sign changes in the sequence 

■ ■ ■ ; Cj_3, Cj-2, Cj_25 Cj_i, Cj_i, Cj, Cj^i, ■ ■ ■ , l^-^j 

is the same as that in the sequence 



and hence that deleting Cj_2,Cj_i,Cj from (2.4) cannot increase the number 
of sign changes. The same argument covers the cases j = 2 and j = n — 1, 
and when j = 1 or j = n this result is immediate. 

Repeated application of this result as elements j of the set { 1, . . . , n }\i^ 
are successively deleted from { 1, . . . , n } then proves the theorem. □ 

Corollary 1 Ifi<k — 2 and all the divided differences Cj{\),i < j < k, are 
non-negative (or non-positive), and ifi<r<s<t<k, then Crst{^) is also 
non-negative (or non-positive) . 

This Theorem is crucial to the effectiveness of the algorithm. In Cullinan 
(1986) it was proved that the set of best approximations is connected, so that 
purely local minima are ruled out, but that this is not the case for higher 
orders of divided differences. This Theorem allows the explicit construction 
of global minima determined by g + 3 of the data (and so it does not seem 
necessary to prove connectedness). There is no analogous result for higher 
order divided differences, and so no ready generalization of the methods of 
this paper to such cases. 

2.1 The case q = 

When q = 0, the required solution is a best convex approximation to f . The 
particular one, y", that will be constructed here was first produced by Ubhaya 
(1979). It will also be convenient to construct a best concave approximation 
to data. 

The convex approximation is determined from the vertices of the lower 
part of the boundary of the convex hull of the points, and the concave approx- 
imation from the vertices of the upper part. Each of these sets of indices can 
be constructed in 0{n) operations by the following algorithm. It is convenient 
to apply the construction to a general range of the data and to describe it in 
terms of sets of indices. Accordingly, define a range [r,s] = {j:r<j<s}, 
and a vertex set of [r, s] to be any set / such that {r, s} C / C [r, s]. Given a 
vertex set / of a range [r, s] and also quantities Vi, i E I, define the gradients 

gik{^) = — - — for I 7^ k. 

X]^ Xi 
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Given any index j '■ 1 < j < n, define the neighbours of j in / by 
j^{I) = minji > j} wlien j < s 
j^{I) = maxjz < j} wlien j > r, 

and (for purposes of extrapolation) 

j^(/) = s~{I) when j > s 

]"{!) = r~^{I) when j < r. 

The interpolant v(/) can now be defined by 

Vj{I) = Vj ioTjel (2.5) 

^i(^) = ^j-ii) + 9j-ii)j+(i){i){xj - Xj-(/)) for J ^ /. (2.6) 

When / = {p, q} it is convenient to write Vj{I) as Vj{p, q) etc. 

The two cases of convex and concave approximations are handled using 
the sign variable a, where a = + for the convex case and a = — for the con- 
cave case. The convex and concave optimal vertex sets I~^{r,s) and I~{r,s) 
are then constructed by systematic deletion as follows. 

Algorithm 1 To find I"{r^ s) when r < s — 2. 

Step 1. Set / := [r, s] and i := r, j := r + 1, k := r + 2. 

Step 2. Evaluate c := Cjjfc(f). If ac > 0: go to Step 5. 

Step 3. Delete j from I. li i = s: go to Step 5. 

Step 4. Set k := j, j := i, and i := i^{I)- Go to Step 2. 

Step 5. li k = t: set I''{r, s) = I and stop. 

Otherwise: set i := j, j := k, and k := k~^{I). Go to 
Step 2. 

The pnce of making a convex/concave approximation in the range [r, s] 
is given by 

h'^ir, s) = \ max a{f, - f,{r{r, s)), (2.7) 

je[r,s] 

and the required best approximation y*^ is then defined by 

yO = /,(/) + /i, (2.8) 

where / = /^(l, n) and h = h^{l, n). 

The construction of y° is illustrated in Figure 1. Fig 101 



Theorem 2 Let n > 3, and let Hq = /i+(l,n) and y^ be defined from (2.1) 
and {2^ using Algorithm 1, |g^ and {2^ . Then F(y°) = Hq = inf F(Fo) 
and y° G Fo- 
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/. 



fAn fkAij 



Figure 1: Consecutive elements k , k, k^ oi I = /^(l, n) and the construction 
of/,(/) 



Proof The first remark is that Algorithm 1 produces a well-defined vertex 
set / of [r, s] from which the quantities fj{I) are also well-defined for all j, 
so that ho and y° are also well-defined. 

The proof that the points {xj,fj{I)), I < j < n, lie on the lower part 
of the convex hull of the data is in Ubhaya. The components fj{I) can be 
defined as those that are maximal subject to the inequalities 



fj{I)<f„ ioil<j<n, 



and 



Q(f(/)) > 0, 



for 1 < i < n — 2, 



(2.9) 



(2.10) 



i.e. f(/) E Yq. It follows that y° G Yq and from ^J^ and (|2^ that F(y°) = 
h. 

It remains to prove that y° is optimal, li h = 0, y^ must be optimal. If 
h > 0, there will be a lowest integer j* ^ / such that equality is attained 



in (2.7), and since 1 and n can never be deleted from /, it must be the case 



that 2<f<n-l. Then 



fj* — ho, 



and k = j (/) and k^ = j^{I) are consecutive elements of I such that 
k < j < k^ and 

yl = fk + ho, 
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yk+ = fk+ + ho- 

Since f ^ I, Ck,j*,k+iy° ) = Cfcj*,fc+(f(/)) = 0. Now if v e ]R" and F(v) < h, 
then 

Vk< fk + h = yl, 

Vk+ < fk+ + h = yl+. 

Now the constraint function Ckj*^k+ is an increasing function of Vk and 
Vk+ and a decreasing function of Vj*, so Ckj*,k+(y) < Ckj*^k+iy^) = 0. It 
now follows from Theorem 1 that \ ^ Yq, and thus that h = h^ and y*^ is 
optimal. □ 

Corollary 2 The components of a best concave approximation to data fj, 
"T < j < s, are given from Algorithm 1 with o = — by yj = fj{I) — h, where 
in this case 1 = 1" (r, s) and h = h~ (r, s) . 

The algorithms in the next subsections will join optimal vertex sets 
produced by Algorithm 1 of consecutive ranges of the data, and they will 
also cut such optimal vertex sets in two. It is convenient to prove here that 
the resulting sets remain optimal vertex sets. The proof requires one further 
property of the optimal vertex sets produced by Ubhaya's algorithm. 

This algorithm is a specialization of an algorithm by Graham (1972) for 
finding the convex hull of an unordered set of points in the plane. It is 
logically equivalent to the algorithm of Kruskal (1964) for monotonic ap- 
proximation, which is more efficient than the algorithm of Miles (1959) for 
monotonic approximation, but produces the same results. Two conclusions 
follow from this. The first, which will be needed later, is that if i and k are 
consecutive elements of /""(r, s) then 



(^Qik 
crgik 



i,(f(/)) = mm{ag,,iiil)) : t<j<s} and (2.11) 

^f(/)) = mm{ag,kiiil)) ■■r<j<t}. (2.12) 



The second, which is of some theoretical interest, is that the gradients 
5'jj_i_i(f(/)), for 1 < j < n — 1, are the best least squares approximations 
to the numbers gjj^i{f), for 1 < j < n — 1, subject to to the constraint 
that they monotonically increase, i.e. to convexity of the points of which 
they are gradients. The proof of this interesting equivalence is a straightfor- 
ward consequence of the equivalence of Kruskal's and Miles's algorithms for 
monotonic approximation. It is given in Cullinan (1986). 
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The results for the joining and sphtting of optimal vertex sets require 
definition of trivial optimal vertex sets by 



r{r,s) = [r,s] 



when s — 1 < r < s. 



(2.13) 



It is also convenient to define h'^{r,s) = when r > s — 1. The following 
lemmas then hold. 

Lemma 1 A subset of one or more consecutive elements of an optimal vertex 
set is itself an optimal vertex set. 



Proof The proof is trivial unless the subset has at least four elements, when 



it follows from the nature of extreme points of convex sets using (2.11) and 



(2.12). D 



The second lemma gives conditions for the amalgamation of optimal 

vertex sets. 

Lemma 2 Given vertex sets I'^ir^s) and I'^{s,t) necessary and sufficient 
conditions for 

r{r,t) = r{r,s)ur{s,t) (2.14) 

are that there exist r' < r, r' & I"{r., s), and t' >t,t'E I^{s.,t), such that 



serir'A' 



(2.15) 



Proof The proof is trivial except for the sufficiency of (2.15) when r < s < 



t. Let s be the left neighbour of s in I"{r, s) and s+ its right neighbour in 



I''{s,t). It follows from (2.11) and (2.12) that (2.14) holds provided that 



crc^-,s,s+(f) > 0. 



However, it follows from (2.15) that acisj{i) > for all i and j in the range 
r'<i<s<j<t', and this range contains s~ and s^. □ 



2.2 The case q=l 

The algorithm to be presented constructs ranges [1, s] and [t, n], where s < t, 
a price h > 0, and a vertex set I of [1, n] such that 



I = I+{l,s)ur{t,n). 
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Figure 2: Construction of best convex-concave approximation 

The best approximation y^ is then given by the final value of the vector 
y defined by 



Vi -- 


= h + h 


when i G /^(l,s) 


(2.16) 


Vi -- 


= U-h 


when i G I^{t,n) 


(2.17) 


Vj -- 


= vAi) 


l<j<n 


(2.18) 



It will be shown that s = t only if /i = 0, so that y^ is well defined. This 
construction is illustrated in Figure 2. Note that when s = t = n, y = y^. 

The algorithm to be given is believed to be more efficient than that in 
Cullinan & Powell (1982). For example if the data are in Yq, the new al- 
gorithm will require only iteration, whereas the former one only has this 
property if the data lie on a straight line. 

The algorithm builds up / by looking alternately at the left and right 
ranges. Beginning with / = {l,n}, s = 1, and t = n, it adds an index k of 
I~^{s,t) to / if the least possible final value of F consistent with doing this 
is not greater than the least possible value of F consistent with ending the 
calculation with the existing value of s. After adding one index in I^{s,t) 
to / and increasing s to the value of this index, it then examines the next. 
When it is not worth adding any more indices from I~^{s,t) to /, it tries to 
add indices in I^{s,t) to / working backwards from t, adding /c to / if it is 
not necessarily more expensive to finish with t reduced to k than with t as 
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it is and decreasing t. After indices have been added to / from I^{s,t) it 
may then be possible to add more to / from the new I^{s, t), so the process 
alternates between I^{s,t) and I~{s,t) until s equals t or t — 1 or until the 
algorithm fails twice running to add any indices. 

Algorithm 2 To find a best convex- concave approximation when n >2. 

Step 1. Set s := 1, t := n, h := 0, and I := {l,n}. 

Step 2. If s > t — 1: stop. Otherwise: set u = t — s and /' := 

J+(s,t). 

Step 3. Let s~^ = s~^{r). Find h~^{s, s+) and set h' := max(/i, h~^{s, s+)). 
liftis,s+)>ft-2h': go to Step 5. 

Step 4. Add s'^ to / and delete s from /'. Set h := h' and s := s~^. 
If s < t — 1: go back to Step 3. 

Step 5. If s = t: stop. Otherwise: set /' := I^{s,t). 

Step 6. Let t~ = t^(/'). Find h~ {t~ ,t) and set h' := Taax(h,h~{t~ ,t)). 
If fsit-,t) < fs + 2h': go to Step 8. 

Step 7. Add t^ to / and delete t from /'. Set h := h' and t := t~. 
li s < t: go back to Step 6. 

Step 8. li t — s = u: stop. Otherwise: go back to Step 2. 

The optimal vertex sets /' are found using Algorithm 1 or Corollary 2.1. 



The prices h'^{s, t) are found from (2.7) and ft{s, s'^) and fs{t , t) from (2.6) 



The first remark is that s is non-decreasing, t is non-increasing, and s <t 
with s = t only when h = 0. The vector y^ is therefore well-defined by 



(|2.16|)-(|2-18|). 

The possibility that Algorithm 1 can end with s = t will involve slightly 
more complexity when this Algorithm is used later on when q > 3. It might 
be prevented by relaxing either of the inequalities in Steps 3and 6. However 
if both these inequalities are relaxed then the Algorithm will fail. For example 
with equally spaced data and f = ( 0, 1, 0, 1), relaxing both inequalities yields 
s = 1, t = 4, and F{y) = |, whereas the Algorithm as it stands correctly 
calculates h = ^ and y= (^,|,^,1). It seemed best to let both inequalities 
be strict for reasons of symmetry. 

The next result concerns the conditions under which the Algorithm in- 
creases s and decreases t. 
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Figure 3: The parallelogram n(/i) 

Lemma 3 At any entry to Step 2 of Algorithm 2, let the vector y be defined 
by (2.16)-(2.18). Then the Algorithm strictly decreases t — s if and only if 

there exists j : s < j < t and \fj — yj\ > h. (2.19) 

(This somewhat cumbersome statement is needed to include the case where 
h = Q.) 

Since the points 



[X 



J' 



Uj), s < j < t, are collinear, (2.19) is equivalent 



to the statement that there exists a point of the graph of the data between 
Xs and Xt lying on or outside the parallelogram Il{h) with vertices {xs,fs), 
{xs, fs + 2/i), {xt, ft), and {xt, ft — 2/i). This parallelogram Il{h) is illustrated 
in Figure 3. 



Proof In the trivial case when s >t — 1, (2.19) is false and the Algorithm 
stops without altering s or t. 

Otherwise, there are one or more data points fj with s < j <t. Suppose 
first that ( |2T9| does not hold. Then h > 0. {li h = then ( |2l9| holds 
trivially!) It will be shown that in this case both s and t are left unchanged. 
Step 3 will calculate an index s^ : s < s^ < t. 

If s'^ = t, then because h' > h > 0, it immediately follows that ft{s, s~^) = 
ft > ft ~ 2/i', so that Step 3 will lead immediately to Step 5 and s will not 
be increased. 

If s^ < t, let ys+ be defined by (2.16)-(2.18) and let g = gstiy)- Then by 
hypothesis, fs+ > ys+ — h, which implies that 5'ss+(f) > g- Thus, 



ft{s, 



fs+gss+{i){Xt-Xs) 
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> fs + g{xt - Xs) 

> ft-2h'. 



yt - h = ft - 2h 



Therefore Step 3 will again lead immediately to Step 5. 

The same arguments applied to t~ calculated by Step 6 show that Step 6 
will branch immediately to Step 8 and so t will also be left unchanged. 



Now suppose conversely that (2.19) does hold, so that there is an index 



j with data point lying on or outside 11. It will be shown that in this case if 
the algorithm does not increase s, it must then decrease t. 

Consider first the case where fj < yj — h. Step 3 will calculate an index 
s"*" in the range s < s^ <t, and, from (2.11), with gss+i^) ^ 9sj{i)- Then 



ft{s,s' 



< 



< 



fs + gss+{i){xt-x,) 

fs + gsjii)iXt - Xs) 

fs + iifj - fs)/iXj - Xs))iXt - Xs) 



(2.20) 

(2.21) 
(2.22) 



fs + {{yj -h- fs)/{xj - Xs)){xt - Xs) by hypothesi^.23) 



ys-h + {{yj - ys)/{xj - Xs)){xt 
yt — h, by the definition of yj. 



X a 



(2.24) 
(2.25) 



It now follows immediately that when h' = h, Step 4 will be entered and s 
increased to s+, as required. 

Now suppose that h' = h~^{s, s^) > h and that s is not increased, i.e. that 
the test in Step 3 leads to Step 5. Then ft{s, s~^) > ft — 2h' = ft — 2h~^{s, s~^). 
Let h^ = h^{s, s+). By definition of h'^, there must be an index i such that 
fi ~ fi{SjS^) = 2h^. This case is illustrated in Fig. 4. It demonstrates the 
heart of the principle behind the algorithm, because the four data points 
with indices s, i, s"*", and t determine a lower bound on inf F(Yi). 

Define 

2d = ft-ft{s,s+). (2.26) 

Then by the hypothesis that Step 3 led to Step 5, 

h+ > d. (2.27) 

Step 5 will be entered with h < h^ and will calculate I~{s,t). Step 6 will 



then find an index t : s < t < t. It follows from (2.11) that fi < fi{t ,t), 
and so 

2h+ = f,- f,{s, s+) < f,{r,t) - f,{s, s+). (2.28) 



This inequality along with (2.26) and (2.27) then show that the function 



^ '-^ fi{t , t) — fi{s, s"*") is strictly decreasing. 
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Figure 4: When s does not increase, t must decrease 



If h does not increase in Step 7, it follows immediately, from the assump- 
tion that h < /i+, that /s(t",t) - /^(s, s+) = /s(t",t) - fs > 2/i+ > 2/i, so 
that t will be reduced to t^ by the test in Step 6. 

If, on the other hand, h does increase in Step 7, its new value h' is given 
by 2h' = fk{i~ ^t) — fk for some k in the range s < k <t. But the definition of 
s+ implies that fk > fk{s, s+), and so 2h' < fk(t',t) - fk{s, s+) < fs{t~,t) - 
fs{s, s+), so that again t must be reduced to t~ by Step 7. In fact it is easy to 
show that in this case, t~ > i, /i' < /i+, and that infF(Yi) > \{fs+{ht) — fs+)- 

Thus t will be reduced to t^ in all cases. 

In the case where there exists j such that /,■ > yj + h, if s is not increased 
immediately after the next entry to Step 3, the same argument shows that 
either t must be reduced in the next operation of Steps 6 and 7, or s must 
be increased immediately thereafter. □ 



{2. IS). Then 



Lemma 4 At any exit from Step 4 or Step 7, let y be defined by (2.16)- 

whens>l, c,_i(y) > 0, (2.29) 

when t < n, Ci_i(y) < 0. (2.30) 



and 



Proof Suppose first that Step 3 is entered. Then (2.29) is equivalent to 

c.+-i(y) > 0, (2.31) 
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which is in turn equivalent to the identity 

ft-ft{s,s^)>2h'. (2.32) 

But this is simply the test leading to Step 4. 

lit < n, there will exist an index t+ = t+(/), and (2.30) will be equivalent 
to the inequality 

/,+ (t,t+)-/,+ >2/i'. (2.33) 



Suppose firstly that h' > h, so that there exists i such that s < i < s^ 



and 



fi-Ms,s+) = 2h'. (2.34) 

The earlier definition of t from t+ implies that fi < fi(t, t^) and it follows 



from (2.34) and (2.32) that the monotonic function (f) : I h^ fi{t,t^) — fi{s, s^) 
satisfies (j)(i) > 2h' and 0(t) > 2h'. It follows that 0(s+) > 2h', which 



is equivalent to (2.33). It follows that whenever h increases, both the join 



constraints c^-i and Ct-i (when defined) are feasible. 

When Step 4 is entered and h does not increase, if Ct-i is initially non- 
positive, which is equivalent to the inequality 



fsit,n-f,>2h, 



(2.35) 



it follows from this and (2.32) with h' = h that 0(s) > 2h and 0(t) > 2h so 



that 0(s^) > 2h as required. 

Similarly when Step 7 is entered and h does not increase, if Cg-i is initially 
non-negative, it will remain so when s is increased. 

The result then follows by induction. □ 

Corollary 3 // steps 2 to 8 of A lgorit hm 2 are executed with h initially set 
to any number h > 0, then (2.29) and (2.30) remain satisfied. 



The effectiveness of Algorithm 2 will now be established. 

Theorem 3 Algorithm 2 produces integers s and t with s < t, an index set 
I such that 

I = I^{l,s)ur{t,n), (2.36) 

and a real number h such that h = if and only if s = t. Ify^ is then defined 



by (2.16)-(2.18), then F{y^) = h = inf F(yi) and y^ G Fi. 
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Proof The first stage is to establish (2.36). Assume inductively that it 



holds before a series of sections is added, and without loss of generality that 
a series of convex sections with indices { s, Si, . . . , Sq, } are added by successive 
entries to Step 4 for the same value of t. Each time an index is deleted from 
I' it follows immediately from Lemma 1 that the new value of /' is also an 
optimal index set, so it always holds that /' = /+(s,t). It also follows from 
Lemma 1 that { s, Si, . . . , s^ } = I^{s, Sa)- Let Ii = {i E I : i < s}. After all 
the sections are added, Ji = /^(l, s) U /^(s, Sq,). If s = 1, it follows at once 
from Lemma 2 that /i = /^(l, s^) as required. Otherwise, let t' be the value 
that t had when s was increased from s~ 
after s was increased from s~ , s E I~^{s~ 
are therefore satisfied and Ji = I~^(l,Sa 



Then Sa < t < t' . Immediately 

,t'). The conditions of Lemma 2 

The same argument applied to 



concave sections then establishes (2.36) 



The algorithm clearly produces a number h such that s = t only if /i = 0. 
It is a consequence of Lemma 3 that if s < t — 1 and /i = 0, then the algorithm 
will reduce t — s. Ifs = t — 1 and /i = 0, Step 4 will increase s to t. Thus 
/i = if and only if s = t. Thus y^ is well defined by (2.16)-(2.18). The 
number h is given by h 



max 



{h{i),h(^2)) where 



(1) 



(2) 



h+il,s), 
h~{t,n). 



(2.37) 
(2.38) 



It follows from (2.36) and Lemma 3 that when the algorithm terminates, 

F{y') = h. 



(2.39) 



It follows directly from (2.36), (2.16)-(2.18) and Lemma 4 that y^ G Fi. 



It remains to prove that y^ is optimal. The method of proof chosen to do 
this can be simplified in this case, but generalizes more directly to the case 
of g > 1. li h = 0, optimality follows immediately from (2.39). Otherwise 
suppose that there is a vector v such that -F(v) 



< h. 



The price h will be 
defined from (2.7) with particular values of a, r, and s. Let j* be the lowest 
value of j in this equation that defines the final value of h. Then j* lies 
strictly between two neighbouring elements k and k~^ of /. Assume firstly 
that j* < s. Define the set K = {k,j*,k~^,s,t}. Since h > 0, s < t, so K 
has at least four elements (it is possible that k~^ = s). Now 



yl - 


= fk + h, 


y]^ - 


= fr-h, 


yl. - 


= fk+ + h 


yl - 


= fs + K 


yl - 


= ft-h. 
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Since -F(v) < h, 



Vk 


< 


fk + h, 


Vj. 


> 


/.- - h, 


Vk+ 


< 


fk+ + h 


Vs 


< 


fs + h, 


Vt 


> 


ft-h. 



By definition of yj* , 

Ckj*k+{y^) = 0, 

and so Ckj*k+(y) < Ckj*k+{y^) = 0. From Lemma 4, Cj_i(y-'-) > for all i in 
the range j* < i <t — 1. It follows from the corollary to Theorem 1 that 

c^stiy^) > 0. 

Then Cj*st{'v) > Cj*si(y"'") > 0. If k^ = s, Theorem 1 can be immediately 
applied to K to show that v ^ Yi. If A;+ < s, it follows from the corollary to 
Theorem 1 that because Cj*st(v) > 0, at least one of the consecutive divided 
differences Cj*k+s(y) and c^+stiy) must be positive, so that again v ^ Yi. 

If j* > s, let K = { s, t, k,j*, k'^ }. Then the same argument shows that 
V cannot be feasible. Therefore y^ is optimal. □ 

2.3 The Case q = 2 

The best Yi approximation constructed in Section 2.2 was defined by equa- 



tions (2.16)-(2.18) from the pieces [l,s] and [t,n\, the index set /, and the 
price h. The best Y2 approximation will in general be constructed from three 
pieces Pi = [l,Si], P2 = [^1,^2], -P3 = [^2,''^], a price h>0 with h > only 
when Si < ti and S2 < ^2, and an index set I of [1, n] such that 

/ = /+(Pi) U /-(P2) U /+(P3), (2.40) 

as the ultimate value y^ of the vector y defined by the equations 

y. = /. + {-y-^h when ielnPa, 1 < a < 3, (2.41) 

y, = y,il) l<j<n. (2.42) 

The construction of y is illustrated in Figure 5. 

The Algorithm will construct this best Y2 approximation from the quan- 
tities ho = h~^{l,n) and I'^{l,n) constructed by Algorithm 1. If the value 
ho of this approximation is zero, the best approximation over Yq is also a 
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Figure 5: Construction of Y2 approximation 



best approximation over Y2. Otherwise, /iq > is determined by three data 
fk, fj*, and fk+ such that k and k~^ are consecutive elements of /+(l,n) and 
k < j* < k~^ . The discussion in Section 2.1 shows that unless the divided dif- 
ferences of y change sign at least once in the range [x^, Xk+ ], then F{y) > h^. 
The set /~(ti,S2) is therefore put in this range. The Algorithm begins with 
si = k, ti = S2= j*, and ^2 = k^. It then sets h = max(/i+(l, si), h'^(t2, n)). 
Next, it uses Algorithm 2 modified to calculate a best convex-concave ap- 
proximation to the data with indices in [l,j*] consistent with paying this 
minimum price h. It is an important feature of the problem that this can 
be done by starting Algorithm 2 with s = k and t = j*, i.e. the existing 
elements of I^{l,j*) below k can be kept in place. 

This process can increase si beyond k, reduce ti below j*, and increase h. 
Let h^^^ be its new value. A best concave-convex approximation to fj*,..., /„ 
starting with h = h^^"* is next identified by applying a modified version of 
Algorithm 2 with s = j* and t = k^, in general increasing S2 beyond j* and 
reducing t2 below k~^. If this second calculation does not increase h abov e 
h^^\ the best approximation y can be constructed immediately from (2.41)- 



(2.42). If, however, h > h^^\ there is the complication that the lower value of 



h when the first calculation took place may have joined too many sections for 
the first join constraints Cs^_i(y) and Ct^_i(y) determined by the new higher 
value of h to have the right signs. In this case the remedy proposed is to 
repeat the first calculation starting with the new value of h. 

The following algorithms will therefore require a modified version of Algo- 
rithm 2 to carry out a best convex-concave approximation or a best concave- 
convex approximation on a range of data, starting with a prescribed value 
of h. This task is best carried out by modifying Algorithm 2 in detail, but 
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the following description is equivalent and simpler. The following procedure 
calculates a best approximation to the data in the range [s^jto,] compatible 
with an existing price h, the approximation being over Yi or Y_i according 
as a is odd or even. It can be seen as trying to close the join [sa,tQ,] as 
much as possible by constructing the best approximation to this range of 
data compatible with the given starting value of h. 

The notation for the pieces is is based on the observation that when h > 0, 
ta = s^(/) and when h = 0, t^ = Sa, so that the location of each of the two 
joins [si,ti] and [52,^2] can be specified simply by consecutive elements of /. 
Thus the location of the pieces and joins can be specified simply through the 
quantities Sq,, 1 < a < 2. It is convenient to regard these as members of an 
ordered subset 5* of /, and to include n in S. Given an index set /, define a 
piece set S oi I to be an ordered subset of / such that n E S. Then Sa will 
denote the ath element of 5*. 

Algorithm 3 closejoin(a) 

Step 1. If a is even: replace f by — f. 

Set s = Sa and t = s'^^I). 

Step 2. Carry out Steps 2 to 8 of Algorithm 2. 

Step 3. Set Sa = s. 

If a is even: replace f by — f . 

Note that this procedure modifies h, I, and S. 

The algorithm constructs y^ by calculating the appropriate pieces and a 
price h{S) from which y^ is constructed. The notation used for keeping track 
of the pieces needs to cover the case of pieces that consist of only one point, 
and to generalize easily when q > 2. It also has to cope with the trivial 
cases where f ^Yq ot f ^Yi and there are, therefore, only one or two pieces 
instead of three. 

Given a piece set S, define its price h(S) as follows. Let q' = \S\ — 1, 
to = 1, and when q' > 1 define ta = s'^{I) when 1 < a < q'. Then 

h{S) = max{ h^-^'^'' (t«_i, Sa) : 1 < a < g' + 1 }• (2.43) 

Recall that h^{t, s) = whenever s >t — 1. 

It is also worth recording the location of the data points determining the 



optimal value of h. Given a piece set S, when h{S) > it follows from (2.43) 



and (2.7) that there exists a lowest index j*{S) such that 

h{S) = la{f).-f,.{k,k-')), (2.44) 
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where a = (-)^~\ 1 < /3 < g' + 1, and 

t/3-i <k<f<k+< 3(3. (2.45) 

Once j* is known, the quantities k, k~^,a.nd /3 are uniquely determined by 



( |2.44D and ( |2.45D . 

The following algorithm then constructs h, I and S determining y^. 

Algorithm 4 To find a best approximation over Y2 . 

Step 1. Set 5* = {n}. Use Algorithm 1 to calculate / = /+(l,n) 
and h = h'^{l,n). 
li h = 0: stop. 



Otherwise: find j*, k, k~^ determined from S by (2.44) and 
proceed to Step 2. 

Step 2. Insert j* into / and k, j* into S. Calculate h = h{S). 

Step 3. Apply closejoin{l). Set h^^^ = h. 

Step 4. Apply closejoin{2). li h = h'^^'^: stop. 
Otherwise proceed to Step 5. 

Step 5. If /i(i) = 0: go to Step 6. 

If Si = A; and sf{I) = j*: stop. 

Otherwise set s = Si and t = sf and calculate g^^^ = 

ift-fs-2h)/{xt-Xs). 

li s > k and gs-(i)s > 9'- go to Step 6. 

If t < j* and gtt+{i) > 9- go to Step 6. 

Otherwise: stop. 

Step 6. Set si = k. Delete all elements of / lying strictly between 
k and j* and then apply closejoin{l). 

Now define the vector y(S') from S and h as follows. Let q' = l^l — 1 and 
to = 1- When q' > I define ta = s^(/) when h > and ta = Sa when h = 0, 
for I < a < q'. Then let 

Pa = [ta-i, Sa] wheu I < a < q + 1, (2.46) 



and define y{S) by (2.41)-(2.42). Set y^ to the value of y(5') on exit from 
Algorithm 4. 

Step 5 is designed to avoid calculating gradients unnecessarily. The fol- 
lowing lemma will be used to justify this economy and also to show that y^ is 
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feasible. Recall the parallelogram n(/i) defined in the proof of Lemma 3, and 
given / and S, let s = Sa and t = s;^(J) and define Ilf (/i) as the parallelogram 
with vertices (x,, /,), (x„ /, + 2{-l)^~'h), {xt, ft), and {xt, ft - 2(-l)"-i/i). 
Each such parallelogram then defines a join gradient 



9^''\S,h) 



ft-fs- 2{-ir-'h 

Xi Xg 



(2.47) 



Lemma 5 Let closejoin (a) be called, modifying h' , I' , and S' to h, I, and 

S respectively. Suppose that there exists h > such that 



and that 
Then 



(x,,/,)Gnf(/.) for s'^<j<s'^il') 



h' <h. 



h<h. 



(2.48) 

(2.49) 
(2.50) 



Further, if g = g^'^>{S', h), g+ = gs'Jy{S)) and g = 9s'+{i')-i{y{S)), then 



ir-'mm{g+,g-)>{-ir-'g. 



(2.51) 



Proof Assume for simplicity that a is odd and write s' = s'^ and t' 

an- 



First consider (2.50). The proof is trivial unless closejoin increases h. In 



this case, h > and so there exist j*, k, k'^ G I such that 

2h = f,*-f,*{k,k^), (2.52) 

where s' < k < j* < k~^ < t'. Let t^ = ■s^(/). It follows from Lemma 3 
that either j* < Sa or j* > ta- The two cases are entirely similar: it suffices 
to consider the first. Define Ijj, s' < j < t', by y^, = fs' + h, y^, = ff — h, 
and Ijj = yj{s',t'), s' < j < f. Then the y^, s' < j <t', are collinear and it 



follows from (2.48) that fk > yk~ ^ &^d fk+ > yk+ — h. Therefore, since y/^, 
ijj*, and yi^+ are collinear. 



y..-h<f,.{kX)- 



It also follows from (2.48) that 



/,* < y,* + h. 



(2.53) 



(2.54) 



Addition of (2.53) and (2.54) and use of (2.52) then gives (2.50) 
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Now consider (2.51). The easiest case is when Sa > s'. Then g^ = gs'i, 
where i = s'+(/). Let the y ^, s' < j < t', be defined as above. Then fgi = 
y^i — h and it follows from (2.48) that fi >yi — h. Then gg'i > gs'iiy) = 9- 



Similarly, when ta <t\ then g < g 
The next easiest case is when s' 



9^ 



9 



= Sa and t' = ta- In this case, 
ft' - fs' - 2h 

Xf ~^ Xg' 



(2.55) 



establish (2.51) 



and it follows from (2.50) that this quantity is not less than g^ as required to 



It remains to resolve the two similar cases where s' < Sa and ta = t', 
and where s' = Sa and t„ < t'. It suffices to consider the former case and 
establish that 9 < 9~ . The procedure closejoin will add one or more indices 
s'~^, ...,i, s to /. (The gradient g~ is now the new join gradient, and the result 
required is analogous to Lemma 4, but this lemma cannot be used directly 
unless h > h', which may not be the case.) The gradients defined by the 
successive elements of I from s'^ to s will be monotonically increasing from 
gs', and it has already been shown that g < gs', so it follows that 

9 < 9^s. (2.56) 

Since ta = t', the test that allows Step 4 of Algorithm 2 to adjoin s to / will 
be the inequality 

ft'{t,s)<U-2h, 

and this inequality is equivalent to 

9is < 9, 
where g is the new join gradient given by {xf — Xs)g 
9 



ft' 



(2.57) 

(2.58) 
/, - 2h. But 



g , so from (2.56), g < g and (2.51) follows. □ 



The proof of the effectiveness of Algorithm 4 and also of its generalization 
in the next section will require an important corollary of Lemma 5. When 
new sections are added on either side of an existing section, the convexity at 
the point where they are joined increases away from zero. Thus when Steps 2 
to 6 of Algorithm 4 are applied, the constraints Ck-i, Cj*-i, and Ck+-i increase 
away from zero, so that no more than two sign changes can be created in the 
second divided differences. More precisely, when defined, these constraints 
satisfy 



Cfc+- 



-i(y^) 
-i(y^) 
-i(y^) 



> 
< 
> 



cfc-i(y°) > 0, 



Cj*. 
Ck+ 



-i(y°) 



<o, 

> 0. 



(2.59) 
(2.60) 
(2.61) 



The effectiveness of Algorithm 4 can now be established. 
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Theorem 4 Algorithm 4 produces a real number h > 0, an index set I, a 
piece set S of I, and a vector y^ = y{S) well defined by (2.46), (2.41) and 
KM such that h = h{S) = F(y2) = inf ^(Fa) and y^ e Fa- 



Proof If the Algorithm stops in Step 1, then h = h'^{l,n) = 0, / = 
I~^{l,n), and S = {n}. Then h{S) = 0. It follows from Theorem 2 that 
f G V C F2. Equation ( |2.46[ ) will set Pi = [l,n] and ( |2.4l[ ) and ( |2.42[ ) will 
set y^ = f . The theorem is then immediately established. 

Otherwise the quantities j*, k, and k^ are well-defined and satisfy the 
inequalities 1 < k < j* < k~^ < n. (Each equality is possible, for example 
when f G Y^.) It follows from Lemma 1 that at this point, 



/ = /+(!, A;) U/+(A:^ 



n 



(2.62) 



Step 2 is then entered. It inserts j* into J, increases S* to S* = {k,j*, n}, and 
calculates h = max(/i+(l. A;), /i"''(A;+,n)). Note that j*, k, and k~^ are now 
consecutive elements of /. 

Steps 3 and 4 are then executed, calling closejoin in the ranges [k,j*] and 
[j*,k^], in general increasing si, S2, and h, and adding new elements to I. 
The new elements of 5* always satisfy the inequalities k < si < j* < S2 < k~^ , 
and h cannot decrease, so h < h. Note that S2 can be increased to n, for 
example when f G Y^. 

Now consider the situation when the Algorithm stops. If h^^'^ = 0, the 
Algorithm either stops in Step 4 when h = 0, or alternatively jumps straight 
from Step 5 to Step 6 re-calling closejoin{l) with h > 0. It follows from the 
properties of Algorithm 3 that when closejoin is called with h > it cannot 
increase s to t. Therefore when the algorithm stops, if /i > then si < j* 



and S2 < k^ . Now define ti and ta as for (2.46). Then y^ is well-defined in 



all cases. It must now be shown that it is feasible and optimal. 
The first main step is to establish (2.40), i.e. 



/ = /+(l,Si)U/-(ti,S2)U/+(t2,^). 



(2.63) 



First consider the range [1, si]. Since k < si, [1, Si] = [1, k] U [k, Si\. Let Ji = 
/n[l,si]. It follows from (|2^ and Theorem 3 that Ji = /+(1, fc)U/+(A;, si). 
It is trivial that Ji = /"^(l, Si) unless 1 < k < Si. In this case there will exist 
neighbouring indices k~ and i of /c in J such that 1 < A;~ < /c < i < si < j*. 
Let Hq = /i+(l, n) and go = g^^\S, Hq). Since k~ , k, and k~^ are neighbours in 
/"•"(l, n), then g^-k < gkk+ = fi'o- Note that g^'^\S, Hq) = go. It is now possible 
to apply Lemma 5 successively with h = ho. The definition of j* allows a 
first application of the lemma with h' = h S' = S in the range [k,j*] (i.e. 
with a = 1) to infer that at entry to Step 4, h^^^ < ho- This inequality and 
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the definition of j* tlien allow a second application of tlie lemma in tlie range 
[j*,k^], wliere also g = go, to infer that h < Hq. If Step 6 is not entered, 
it follows from the first application of the lemma that g^i > go- If closejoin 
is called again in Step 6, a third application of the lemma may be made, in 
the range [k,j*], to yield that in this case also, gki > go- Then g^-k < gki- 
Lemma 2 can now be applied to prove that J+(l,/c) U I^{k,si) = /+(l,si). 
Thus in all cases 

/i=/+(l,si). (2.64) 

In the same way, if /s = / fl [t2, n], then I^ = I^{t2, n), trivially when t2 = n 
and otherwise by a single application of Lemma 5. 

Now consider the range [ti,S2]- Let /2 = / fl [ti,S2]. Since j* G /, ti 
can never exceed j* and so [ti,S2] = [ti,j*] U [j*, S2]. In all cases I2 = 
I~(ti,j*)Ul~{j*, S2), and it is trivial that I2 = /^(ti, S2) unless ti < j* < S2- 
In this case there will exist left and right neighbours of j* in I. Denote them 
by j~ and j+. Then the successive applications of Lemma 5 made above 
establish that gj-j* > go and that gj*j+ < 5^0 so that Lemma 2 again applies 



to give that I2 = /~(ti, S2)- Equation (2.63) is then established. 

The feasibility of y^ can now be proved. It is only necessary to examine 
the four (or possibly fewer) join constraints c^i-i, Ct^-i, c^j-i, and Ct2-i when 
Si < ti and S2 < t2- First consider the last two constraints. Since S2 < k~^, 
Lemma 4 shows that Cs2_i(y^) < whenever S2 exceeds j*- When S2 = j*, 
the corollary to Lemma 5 applies to show that Cs2_i(y^) < 0. Similarly, 
Lemma 4 when ^2 < k^ and the corollary to Lemma 5 when ^2 = ^^ show 
that Ct2-i(y^) > whenever ^2 < n- When t2 = n the feasibility of y^ will 
follow automatically from the signs of the other constraints. Now consider 
the constraints Csj^-i, Ct^_i. When Step 6 is entered or h^^^ = h, the same 
reasoning applies to show that Cs^_i(y^) > when si > 1 and Ci^_i(y^) > 
when ti > 1. (If f G l^o"' the Algorithm will reduce ti to 1.) When Step 6 is 
not entered and h > h^^\ Lemma 4 cannot be applied, but Lemma 5 and its 
corollary then show that feasibility is assured unless Si > k or sf < j*. In 
these cases all the gradients calculated are well-defined and the test in Step 5 
allows the algorithm to stop only when y^ is feasible. 

It is now necessary to show that -F(y^) = h- 



The construction of h and h, the inequality h < h, and (2.63) show that 
h = h{S) and \yj — fj\ < h when j G [1, Si] U [ti, S2] U [t2,n], with equality 
when j G /. For any other value of j in the range [1, n], Lemma 3 and the 
inequality h^^^ < h show that \yj — fj\ < h whether or not Step 6 is entered. 
Then F(y2) = h- 

li h = 0, there is nothing further to prove. Otherwise Sa < ta,ioT a = 1,2- 
Then it is possible to redefine j* = j*{S) and to let k, fc+, and /3 be uniquely 
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point piece 



tn = l 




Figure 6: Construction of Y5 approximation 

redefined from j*. Then j* G P/3. Let K = {k, j* , k~^ , Si,ti, S2,t2}. Then 
K cannot have fewer than five elements, even if ti = S2- Assume first that 
(3 = 1. It follows by the same argument used in the proof of Theorem 3 
that Cfc,*fc+(y^) = 0, Cj*s,tAy^) > 0, and Cs,t,t,{y^) < 0. Then if F(v) < h, 
Ckj*k+(y) < 0, Cj*siii(v) > 0, and Cs-i^tit2(y) < 0, and therefore the consecutive 
divided differences of v with indices in the subset K must change sign twice 
starting with a negative sign, so that by Theorem 1, v ^ 1^2- If /? = 2 or 
/3 = 3, the argument is similar. □ 

2.4 The general case 

The algorithm to be described constructs a best approximation to f over Yg 
from g+1 alternately convex and concave pieces joined where necessary by up 
to q straight line joins. These pieces are built up recursively from the pieces of 
a similar best approximation over Yq^2 by essentially the same method used in 
the previous section when q = 2. All the sections of the Yq-2 approximation 
remain in place except one determining the value of ini F{Yg_2) which is 
deleted and replaced with a new piece of opposite convexity to the piece 
containing this section, and two new joins. Starting with the value hoi a best 
approximation over Yq determined by the remaining sections, the procedure 
closejoin is called in each join. After this has been done, if h has increased, 
the resulting join constraints are checked and if necessary the calculation in 
each join is repeated with the new value of h. 5 The construction of y is 
illustrated in Figure 6. 

This method has the disadvantage that calculations sometimes have to 
be performed twice in each join, but each of the two sets of join calculations 
can be performed in parallel. The procedure closejoin is a generalization of 
Algorithm 2 which is a modification of the method presented in Cullinan 
& Powell (1982). That method avoids having to repeat itself by having an 
upper bound on h available throughout, but because of this does not admit 
of as much of its calculations being performed in parallel. It is therefore 
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believed that the algorithm to be presented will often be more efficient. 
Most of the notation needed for this case has already been developed. In 



particular, given a piece set S, (2.43) defines the corresponding price h{S) 



and when this price is non-zero, (2.44) and (2.45) define the index j* giving 



rise to it and the index (3 of the piece within which it lies. For any h the 
piece Pa is defined as [ta-i,Sa], where ta = s'^{I) when h > and ta = Sa 
when h = 0. Since every element of I lies in P^, for some a, the vector y(S') 
can be defined by 

y. = /. + (_)"-i/i wheniG/nP„, 1 < a < g' + 1, (2.65) 

%• = %•(/) l<J<n. (2.66) 

The following algorithm then calculates S, I, and h from which y^ is 
defined by these equations as the final value of y(S'). Each time h is increased, 
the new values of j* and /3 are calculated and recorded. To identify joins 
where a second call of closejoin will always be needed when h increases from 
zero, the index 7 will denote the lowest index for which h > after the first 
call of closejoin{'j). When h > 0, the join gradients g^"\S, h) are defined by 
( |2l7l ). 



Algorithm 5 To find a best Yq approximation. 

Step 1. Set q = q modulo 2. 

If g = 0: set 5 := {n} and use Algorithm 1 to calculate 

/ :=/+(!, n) and h = h+{l,n). 

Otherwise set S := {l,^}, / := {1,'^}, h := and call 

closejoin{l). 

Step 2. li q = q or h = 0: stop. 

Otherwise: set j := j*{S), k := j~{I), insert j into /, 

and insert j and k into S, increase g by 2 and calculate 

h:=h{S). 

Set h' := h, r := /, S' := S and 7 := 1. 

Step 3. For a = 1 to g: apply dosejoin{a), and if h has increased 
set a := a, and if /i = set 7 := a + 1. 
li h = h': return to Step 2. 
Otherwise: set a = 1 and go on to Step 4. 

Step 4. If a < 7 go to Step 5. 

If a = a return to Step 2. 
Calculate g = g^^'^S, h). 
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If s^ > s'^ and (-l)°-i(7^-(,)^^ > i-ir-'g: go to Step 5. 
Set t := s+(/') and t+ := t+(/). 

If t < s'„+(/') and (-l)"-i^ti+ > (-1)""^^: go to Step 5. 
Increase a by 1 and repeat this step. 

Step 5. Set Sa '■= s'^, delete all elements of / lying strictly between 
Sa and s'^{I'), and call dosejoin{a). 
Increase a by 1 and return to Step 4. 

In practice, in order to anticipate rounding errors, the tests whether h = 
should be whether h < 0. The applications of closejoin in each step could 
be performed simultaneously. In this case the largest of the ensuing values 
of the parameter h in Step 3 should be the value of h at entry to Step 4 and 
a should be set to n. It will be shown in the proof of the following theorem 
that h is constant during Step 4. 

Theorem 5 Algorithm 5 produces a real number h > 0, an index set I, a 



piece set S of I , and a vector y'^ = y(5') well defined by (2.65) and (2.66) 
such that h = F(y«) = mi F{Yg) and yi e Yg. 

Proof The proof that y & Yq and F{y) = h is hj induction on q. Assume 
that at any entry to Step 2 leading to Step 3, i.e. with q < q and h > 0, that 

9+1 

/= U/(-)"(t,_i,s«), (2.67) 

a=l 

where ta = ■5^(1), and that 

h = h{S). (2.68) 

The vector y(S') is then well defined. Assume that 

F{y{S)) = h (2.69) 

and that 

y(5) e Yg. (2.70) 

It will be deduced from these equations that when the algorithm termi- 
nates y(5') G Yg and that h = iniYg. 



It has been shown in Sections 2.1 and 2.2 that (2.67)-(2.69) hold at first 
entry to Step 2. 

Most of the work needed for the proof has already been done in Lem- 
mas 3 to 5. The main task is to examine the way h changes, so as to be 
able to apply these lemmas. Suppose that Step 2 begins with h = h, I = I 
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and S = S, and that it increases q to q'. Step 2 also modifies 5* from 5* 
to S' = S U {j,k} and / from / to /' = / U {j}, and recalculates h as 
h' = h{S') using (2.43). Now by (2.68) and the definition of j in Step 2, 
h = h'^(ti3-i,Si3), where (3 is defined in Step 2 by (2.45), a = (—)'', and 
tp_i = s^_i(/). By Lemma 1, I"(tfj-i, Sfj) = I^itp-i, k) U I''{k+, sp), and by 
definition, h^itp-i^sp) > max(/i°"(t/3_i. A;), /i'^(A;+, S/3)). It follows that 

h' < h. (2.71) 

Furthermore, it follows from the definitions of h and f3 when a = /3,/3 + 1 
and otherwise from (|2.67[)-(2.69) that 



x„/,)Gnf(/^) for s'^<j<s'^{I'). 



(2.72) 



Step 3 then applies dosejoin(a) in each join. Let h^""^ be the value of h af- 
ter closejoin{a) is called. Clearly the /i^"-* are monotonically non-decreasing. 



Lemma 5 can now be applied successively beginning with (2.71 ) to show that 

/i(") <h for 1 < a < g'. 

Let t'^_i = s^_i{I'). Then Lemma 5 and its corollary also show that whenever 
t'a-i < s'a < Sa, the gradients on either side of fs'^ have the correct mono- 
tonicity for Lemma 2 to yield that I"{t'^_^, s'^)Ul"{s'^, Sa) = I'^it'^^^, s^), and 
that when t,_i < t'^_^ < s'^ that r{t^_^,t'^^^) U /'^(t'^.i, s'^) = /"(ta-i, s'^), 
where a = (— )"^^. It follows either from this or otherwise trivially that in 
all cases after dosejoin{a) is called, 

/np, = r(t,_i,s,). 

Because h cannot increase further after Step 3, this holds true whether 
closejom{a) is called once only or again in Step 5, so that when Step 2 
is next entered, 

g'+l 

/= U /(-)'^(t«_i,sj. (2.73) 

The next step is to establish that when Step 2 is next entered with h > 
then h = h{S). Since h > and k < j < k~^ in Step 2, the quantity h{S) to 
which h is set by Step 2 is well defined by (2.43) with ta = s'^{I). Thus 



h' = h{S'). 

If Step 5 recalls closejoin{a) for any a, it will not increase h further, so h 
always attains its final value by the end of Step 3. It must be shown that 
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the quantity h{S) is always well defined when Step 2 is next entered. This 
can only fail to be the case if there is an a for which h^"'^ = 0. In such a case 
it must hold that h' = and that h increased during Step 3. Then a will be 
set to the index of the call of closejoin in which h achieved its final positive 
value, and the parameter 7 will be set to the lowest index for which h^^^ > 0. 
Clearly, then, 7 < a and a < 7 < a, so that Step 4 will branch to Step 5, 
recalling closejoin{a) with a positive h so that afterwards t^ = s'^{r). The 
quantity h{S) will then be well defined when Step 2 is entered. It follows 



from the form of Algorithm 2 and (2.73) that whether h increases from h' or 



not, then when Step 2 is next entered, 

h = h{S). (2.74) 

The vector y = y{S) will then be well defined at the next entry to Step 2. 



The next argument will establish that (2.69) then holds. It follows from 



(2.73) and (2.74) that it is sufficient to show that for any indices j such that 
Sa < j < ta, \yj{S) — fj\ < h. This is equivalent to the statement that the 
graph point {xj,yj{S)) lies inside the parallelogram nf(/i). It follows from 
Lemma 3 that immediately after closejoin{a) is last applied, {xj,yj{S)) lies 
strictly within Ilf (/i). If this call of closejoin comes in Step 5, h has already 
attained its value at next entry to Step 2 and the result is immediately 
established. If, on the other hand, this call of closejoin comes in Step 3, then 
(xj,yj{S)) lies inside Il^(h^°'^), and h^"^ < h. A simple calculation shows 
that if a < 6 then Ilf (a) C Ilf (6). The result then follows. Thus at next 
entry to Step 2, 

Fiy{S)) = h. (2.75) 



The next argument will re-establish (2.70) at that point. It follows from 



(2.67) that it is only necessary to establish that the join constraints Cs^_-^ and 



Qq-i of y have the correct signs when 2 < Sq, < to, < n — 1, i.e. when h > 0. 



When 1 < Sa = s'^, (2.70) and the corollary to Lemma 5 imply that 
Cs'^_i(y) had the correct sign at the last exit from Step 2 and has not now 
moved closer to zero. The same is true of Ct^_^ when t^ = t'^ < n When 
Sa > s'^, Lemma 4 applies provided that h has not increased after the last 
call of closejoin{a) to yield that (— l)°~^Cs^_i > and that when 1 < t„ < t'^ 
that (— l)"~^Q^_i < 0. If closejoin{a) is only called once and h increases 
subsequently, it must be shown that the test in Step 4 is adequate to ensure 
feasibility. In this case Step 4 will certainly be entered (because h > h') and 
Step 3 will set a and 7 so that 7 < a < «, so that /i*^"-* > 0. It follows that 
Sa < ta and so g is well defined and the correct sign of Cs^_j is assured by 
the test in Step 5. The case ta < t'^ is similar. Thus at next entry to Step 2, 
even when h = 0, 

y G n- (2-76) 



3. CONCLUSIONS 30 



Thus (2.67)-(2.70) are established by induction. It follows that when the 
algorithm terminates with a number h and the pieces from which the vector 
y is constructed, y EYg cYg and F{y) = h. 

The proof of optimality is similar to that in Theorem 4. li h = there 
is nothing to prove. Otherwise, after the algorithm has terminated, let s^, 
ta, 1 < a < g + 2, be defined as the join points that would next have been 
constructed in Step 2 if the algorithm had not terminated (i.e. by adding k, 
j*, and k~^ to the existing set of join points and reindexing.) It follows as in 
the proof of Theorem 3 that because h > 0, Sa < ta for all a. Now define 
K = {sa, ta : 1 < a < g + 2}. Because the piece [t;3,S/3+i] has only one 
element and any of the other pieces can have only one element, this set K can 
have from g + 3 to 2g + 3 elements. It has to be shown that the consecutive 
divided differences with indices in K of any vector v G IR" giving a lower 
value of F than h change sign q times starting with a negative sign, so that 
by Theorem 1, v ^Yg. 



It follows from the construction of y and (2.70) that (—1)° ^Cj_iy < 
for So < i < ta+i and so, from the corollary to Theorem 1, that for 
any a in the range I < a < q + 2, (-l)"~^Cs(a),t(a),s(a+i)(y) < and 
(-l)" ~^Cg(a),s(a+i),f(a+i)(y) < 0. If t„ = s„+i, it follows immediately from 
(|2]65| that if F{^r) < h 

( — 1)''" Cs(a),t(a),i(a+l)(v) < 0. 



Otherwise when ta < Sa+i, it follows from (2.65) that if -F(v) < h, then 
(-l)"-^Cs(a),t(a),t(a+i)(v) < and (-l)"-^C5(„),3(c,+i),i(„+i)(v) < and hence, 
from the corollary to Theorem 1, that the consecutive divided differences 
with indices in K satisfy 

(-l)""^Cs(a),t(„),s(a+l)(v) < or {-l)"'~^Ct(a),s{oi+l),tia+l)(y) < 0. 

Therefore the divided differences of v with indices in K have at least q 
sign changes starting with a negative sign, and so if v is any vector for which 
F(v) < h, then v ^ Fg. D 

3. Conclusions 

This section will describe the results of some tests of the data smoothing 
method developed in the previous section. These tests were conducted by 
contaminating sets of values of a known function with errors, applying the 
method, and then comparing the results with the exact function values. If g 
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is the vector of exact function values, a simple measure of the effectiveness 
of the method can be obtained from the quantity 



1- 



SWp, 



obtained from the £p norm. 

Algorithm 5 was trivially extended to find a best approximation over Y±q. 
It was coded in PASCAL and run on a SUN computer. The errors added to 
exact function values were either truncation or rounding errors, or uniformly 
distributed random errors in the interval [— e,e]. For each set of data the 
values of Poo and P2 were calculated. Many of the results were also graphed. 

As a preliminary test, equally spaced values of the zero function on [—5, 5] 
were contaminated with uniformly distributed errors with e = 0.1. The 
results are shown in Table 1. 



n 


Poo 


P2 


5001 


-19.37 


94.51 


501 


-68.97 


76.25 



Table 1: The zero function 



The difference between yj and zero was of the order of 10~^ for most of 
the range, but near the ends it rose to 10~^, accounting for the relatively 
high value ||y — g||2 = 0.3098 when n = 501. High end errors were frequent 
and so the value of Poo was not usually a reliable indicator of the efficiency 
of the method. 

Two main types of data were then considered: undulating data and peaky 
data, because it is these types of data that can be hard to smooth using 
divided differences unless sign changes are allowed. The first main category 
of data were obtained from equally spaced values of the function sin nx on 
the interval [—2, 2], and the second from equally spaced values of the normal 
distribution function Ns{x) = (27rx)~2 exp(— x^/2s^) with s = 0.8, on the 
interval [—5,5]. These same functions were previously used to test the £2 
data smoothing method of Cullinan(1990) which did not allow sign changes, 
where it was shown that the sine data were possible to treat well but the 
peaked data did not give very good results. 

Many of the results looked very acceptable when graphed, even when P2 
was only moderate. The results of Table 1 have already indicated that it is 
quite difficult for P2 to approach 100. This is often because of end errors, as 
there, but there is also another phenomenon that can move smoothed points 
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further away from the underlying function values. The method raises convex 
pieces and lowers concave pieces, and this often results in points near an 
extremum of the underlying function being pulled away from it, for example 
if there is a large error on the low side of a maximum. 

Different sets of random errors with the same e can give very different 
values of both Poo and P2, particularly when the spacing between points is 
not very small, and in fact it was found that reducing the spacing beyond a 
certain point can make a great difference. 

The method coped quite well with the sine data and was very good with 
the peaked data, and a great improvement on the method in Cullinan(1990), 
managing to model both the peak and the flat tail well. The results of one 
run are shown in Figure 7. 



N data eps=0.01 n=101 q=2 P2 =45.8 




Figure 7: Peaked Data with n = 1, q = 2, e = 0.01, P2 = 45.8 

The method is very economical, and can give very good results, particu- 
larly when the data are close together. Thus it seems a good candidate for 
large densely packed data, even when the errors are relatively large. BIBLI- 
OGRAPHY 
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